6 



Binary Periodic Synchronizing 
Sequences 



Marcin Skubiszewski 



May 1991 



Publication Notes 

This article will also appear in Theoretical Computer Science, Part A, Volume 99 (October 
1992). 

Author's electronic address: skubi@prl.dec.com



© Digital Equipment Corporation 1991 



This work may not be copied or reproduced in whole or in part for any commercial purpose. Permission 
to copy in whole or in part without payment of fee is granted for non-profit educational and research 
purposes provided that all such whole or partial copies include the following: a notice that such copying 
is by permission of the Paris Research Laboratory of Digital Equipment Centre Technique Europe, in 
Rueil-Malmaison, France; an acknowledgement of the authors and individual contributors to the work; 
and all applicable portions of the copyright notice. Copying, reproducing, or republishing for any other 
purpose shall require a license with payment of fee to the Paris Research Laboratory. All rights reserved. 




Abstract 



In this article, we consider words over {0, 1}. The autodistance of such a word is the lowest
among the Hamming distances between the word and its images by circular permutations other
than identity; the word's reverse autodistance is the highest among these distances. For each
$l \ge 2$, we study the words of length $l$ whose autodistance and reverse autodistance are close to
$l/2$ (we call such words synchronizing sequences).

We establish, for every $l \ge 3$, an upper bound on the autodistance of words of length $l$. This
upper bound, called $\mathrm{up}(l)$, is very close to $l/2$.

We briefly describe the maximal period linear recurring sequences, a previously known family
of words over {0, 1}; such words exist for every length of the form $l = 2^n - 1$ and their
autodistances achieve the upper bound $\mathrm{up}(l)$.

Examples of words whose autodistance and reverse autodistance are both equal or close to
$\mathrm{up}(l)$ are discussed; we describe the method (based on simulated annealing) which was used
to find the examples.

We prove that, for sufficiently large $l$, an arbitrarily high proportion of words of length $l$ will
have both their autodistance and reverse autodistance very close to $\mathrm{up}(l)$.



Résumé

Nous considérons dans cet article des mots sur {0, 1}. Nous appelons autodistance d'un
tel mot la plus petite des distances de Hamming entre lui-même et ses images par des
permutations circulaires non identiques ; l'autodistance inverse du mot désigne la plus grande
de ces distances. Pour tout $l \ge 2$, nous étudions les mots de longueur $l$ dont l'autodistance et
l'autodistance inverse sont toutes les deux proches de $l/2$ (de tels mots seront appelés suites
synchronisantes).

Pour tout $l \ge 3$, nous établissons une borne supérieure sur l'autodistance des mots de longueur
$l$. Cette borne supérieure, notée $\mathrm{up}(l)$, est très proche de $l/2$.

Nous présentons brièvement les suites linéairement récurrentes de période maximale, une
famille déjà étudiée de mots sur {0, 1} ; de tels mots existent pour toute longueur de forme
$l = 2^n - 1$ et leur autodistance atteint la borne $\mathrm{up}(l)$.

Nous considérons des exemples de mots dont l'autodistance et l'autodistance inverse sont
toutes les deux proches de $\mathrm{up}(l)$ ou égales à cette valeur ; nous décrivons la méthode (une
adaptation du recuit simulé) qui a permis de trouver ces exemples.

Nous prouvons que, pour $l$ suffisamment grand, l'autodistance et l'autodistance inverse sont
très proches de $\mathrm{up}(l)$ pour une proportion arbitrairement élevée des mots de longueur $l$.

1 Introduction

1.1 Subject of the article 

Modern radio techniques, including radar and spread-spectrum communications, make use 
of finite sequences of bits exhibiting various correlation properties (e.g. [5], [2] chapters 10 and 
12, [1]). The correlation properties of a sequence measure how easily it can be distinguished, 
after a transmission with errors, from other related sequences (the notion of related sequences 
is application-dependent). 

We study here two correlation properties, the autodistance and the reverse autodistance. 
The autodistance measures how well, in the worst case, the receiver will be able to distinguish 
between the sequence and a non-identical circular permutation of it (in this case, we consider 
that circular permutations of a sequence are related to it). The reverse autodistance measures 
the difficulty that the receiver will have, in the worst case, distinguishing between the sequence 
and a circular permutation of its one's complement (here, we consider that circular permutations 
of the one's complement of a sequence are related to the sequence). 

In this study, we focus on searching for, and estimating the number of, sequences that 
exhibit a high autodistance (the synchronizing sequences) and those that exhibit both a high 
autodistance and a low reverse autodistance (the double synchronizing sequences). 

1.2 Contents

Section 2 of the article introduces the necessary notation and mathematical objects (including 
precise definitions of autodistance and reverse autodistance). 

In Section 3, we investigate which values the autodistance and reverse autodistance can
attain. We establish, for each length $l$, an upper bound on the autodistance of sequences of
this length (Section 3.1); we complete this basic result with several remarks about the reverse
autodistance of certain classes of sequences (Sections 3.2-3.3).

In Sections 4-6, we either find, or prove the existence of, sequences whose autodistance and 
reverse autodistance approach the previously established bounds. 

In Section 4, quoting already known results [4], we introduce the maximal period linear
recurring sequences, a family of double synchronizing sequences which achieve the bounds
for certain lengths $l$.

In Section 5, we describe examples of double synchronizing sequences whose lengths are 
between 3 and 405; these examples achieve, or almost achieve, the bounds. We present a 
computational method, based on simulated annealing, which we used to find the examples. 

In Section 6, we establish a theorem implying that among very long sequences of bits, 
almost all have their autodistances and reverse autodistances close to the respective bounds. 

2 Definitions and Notation



2.1 Basic notation 



$i \sqcap j$              greatest common divisor (GCD) of $i, j \in \mathbb{N}$

$[a\,..\,b]$              interval $\{\, i \in \mathbb{Z} \mid a \le i \le b \,\}$

$[a\,..\,b)$              interval $\{\, i \in \mathbb{Z} \mid a \le i < b \,\}$

$\mathbb{N}_{2+}$         set of natural numbers $\ge 2$

$\{0,1\}^{2+}$            set of words over {0, 1} of length $\ge 2$

$\{0,1\}^{l}$             for $l \in \mathbb{N}_{2+}$, set of words over {0, 1} of length $l$

$|S|$, $|E|$              length of the word $S \in \{0,1\}^{2+}$; cardinality of the set $E$

$|S|_0$, $|S|_1$          number of zeros (resp. ones) in $S \in \{0,1\}^{2+}$

$(F_x)_{x \in X}$         the family of elements $F_x$, indexed by elements $x \in X$; by definition, $|(F_x)_{x \in X}| = |X|$

$|\mathcal{F}|_A$         number of elements of the family $\mathcal{F}$ belonging to the set $A$; if $\mathcal{F} = (F_x)_{x \in X}$, then
                          $|\mathcal{F}|_A = |\{\, x \in X \mid F_x \in A \,\}|$

$A \,\Delta\, B$          symmetrical difference between sets: $A \,\Delta\, B = (A \cup B) - (A \cap B)$

$xA$                      for $x \in \mathbb{R}$ and $A \subset \mathbb{R}$, the set $\{\, xy \mid y \in A \,\}$; the definitions of
                          $A + x$ and $A - x$ are analogous

$S[i]$                    for $S \in \{0,1\}^{2+}$ and $0 \le i < |S|$, the $i$-th digit of $S$

$\tau_p$                  circular permutation by $p$ of words from $\{0,1\}^{2+}$:
                          $\tau_p(S)[i] = S[(i + p) \bmod |S|]$

$d(S, T)$                 for $S, T \in \{0,1\}^{l}$, the Hamming distance between $S$ and $T$:
                          $d(S, T) = |\{\, i \in [0\,..\,l) \mid S[i] \ne T[i] \,\}|$



2.2 Notation of objects defined in the article 

$d(S)$                    for $S \in \{0,1\}^{2+}$, the autodistance of $S$ (Definition 1 below)
$d'(S)$                   for $S \in \{0,1\}^{2+}$, the reverse autodistance of $S$ (Definition 2 below)
$\mathrm{up}(l)$          for $l \in \mathbb{N}$, $l \ge 3$, $\mathrm{up}(l) = 2 \lfloor (l+1)/4 \rfloor$ (Definition 7 below)



2.3 Autodistance and synchronizing sequences 

Definition 1 (autodistance) For $S \in \{0,1\}^{2+}$, the autodistance of $S$ is the minimum of the
Hamming distances between $S$ and all its images by circular permutations other than identity:

$$d(S) = \min_{p \in [1\,..\,|S|)} d(S, \tau_p(S))$$

Definition 2 (reverse autodistance) For $S \in \{0,1\}^{2+}$, the reverse autodistance of $S$ is the
maximum of the Hamming distances between $S$ and all its images by circular permutations:

$$d'(S) = \max_{p \in [0\,..\,|S|)} d(S, \tau_p(S))$$

Examples: The null word of any length satisfies $d(S) = d'(S) = 0$. The words 001 and
0011 satisfy

$$d(001) = d'(001) = 2$$
$$d(0011) = 2$$
$$d'(0011) = 4$$
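
The two definitions can be checked mechanically on the examples above. The following is a minimal sketch, assuming words are represented as Python strings of `0` and `1`; the function names `rotate`, `hamming`, `autodistance` and `reverse_autodistance` are illustrative and do not come from the article.

```python
def rotate(s: str, p: int) -> str:
    """Circular permutation tau_p: result[i] = s[(i + p) mod len(s)]."""
    p %= len(s)
    return s[p:] + s[:p]


def hamming(s: str, t: str) -> int:
    """Number of positions at which two words of equal length differ."""
    return sum(a != b for a, b in zip(s, t))


def autodistance(s: str) -> int:
    """Minimum distance to the non-identical circular permutations (Definition 1)."""
    return min(hamming(s, rotate(s, p)) for p in range(1, len(s)))


def reverse_autodistance(s: str) -> int:
    """Maximum distance to all circular permutations (Definition 2)."""
    return max(hamming(s, rotate(s, p)) for p in range(len(s)))


print(autodistance("001"), reverse_autodistance("001"))    # 2 2
print(autodistance("0011"), reverse_autodistance("0011"))  # 2 4
```

Both functions cost time quadratic in the word length, which is sufficient for the short examples discussed here.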



Definition 3 (optimal synchronizing sequence) An optimal synchronizing sequence of length
$l \in \mathbb{N}_{2+}$ is a word $S \in \{0,1\}^l$ whose autodistance is maximal; in symbols, $S \in \{0,1\}^l$ is an
optimal synchronizing sequence if and only if

$$\forall\, T \in \{0,1\}^l:\quad d(T) \le d(S)$$

Informally we call any word $S \in \{0,1\}^l$ whose autodistance is maximal or nearly maximal
a synchronizing sequence of length $l$.

Definition 4 (double-optimal synchronizing sequence) A double-optimal synchronizing sequence
of length $l \in \mathbb{N}_{2+}$ is a word $S \in \{0,1\}^l$ whose autodistance is maximal, and whose
reverse autodistance is minimal among all words in $\{0,1\}^l$ having the maximal autodistance;
in symbols, $S \in \{0,1\}^l$ is a double-optimal synchronizing sequence if and only if

$$\forall\, T \in \{0,1\}^l:\quad d(T) < d(S) \;\lor\; \bigl(d(T) = d(S) \land d'(T) \ge d'(S)\bigr)$$

Informally, any word $S \in \{0,1\}^l$ whose autodistance is maximal or nearly maximal and
whose reverse autodistance is, among the words having the same autodistance as $S$, minimal
or nearly minimal, will be called a double synchronizing sequence of length $l$.

Definition 5 (uniform sequence) A uniform sequence is a word $S \in \{0,1\}^{2+}$ such that

$$d(S) = d'(S)$$

It follows from Definitions 1 and 2 above that the sequence $S \in \{0,1\}^{2+}$ is uniform if
and only if the number $d(S, \tau(S))$, where $\tau$ is a non-identical circular permutation, does not
depend on the choice of $\tau$.

Examples: The null word of any length is a uniform sequence. A word of any length
containing a unique 1 and having all other digits equal to 0 is a uniform sequence.

Definition 6 (uniform optimal synchronizing sequence) A word from $\{0,1\}^{2+}$ is a uniform
optimal synchronizing sequence if it is a uniform sequence and an optimal synchronizing
sequence.

Informally, any word from $\{0,1\}^{2+}$ which is both a uniform sequence and a synchronizing
sequence will be called a uniform synchronizing sequence.

It follows from the definitions above that a uniform optimal synchronizing sequence is also
a double-optimal synchronizing sequence.

Example: The word 001 is a uniform optimal synchronizing sequence. Long optimal
synchronizing sequences are never trivial.

3 Bounds on Synchronizing Sequence Characteristics 

Theorem 1 below establishes an upper bound on the autodistances of synchronizing 
sequences. Theorems 2 and 3 establish that uniform synchronizing sequences of certain forms 
do not exist. Theorem 4 states that all optimal synchronizing sequences in a certain category 
are uniform. 

3.1 An upper bound on the autodistance 

Theorem 1 (an upper bound on the autodistance) For every $l \in \mathbb{N}$, $l \ge 3$, the autodistance
of every word $S \in \{0,1\}^l$ is less than or equal to the value given in the following table (for
$n \in \mathbb{Z}$):

    l = |S|      bound on d(S)
    4n           2n
    4n + 1       2n
    4n + 2       2n
    4n + 3       2n + 2

Definition 7 ($\mathrm{up}(l)$) For every $l \ge 3$, the upper bound given in the table in Theorem 1 will be
denoted $\mathrm{up}(l)$.
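
For very short words, the bound of Theorem 1 can be compared with the true maximum by exhaustive enumeration. The sketch below is only an illustration under stated assumptions: words are tuples of 0 and 1, `up` implements the closed form $2\lfloor (l+1)/4 \rfloor$ from Section 2.2, and `max_autodistance` is a hypothetical helper whose cost grows as $2^l$, so it is usable only for small $l$.

```python
from itertools import product


def up(l: int) -> int:
    """Upper bound of Theorem 1, in the closed form 2 * floor((l + 1) / 4)."""
    return 2 * ((l + 1) // 4)


def autodistance(bits) -> int:
    """Minimum Hamming distance between a word and its non-identical rotations."""
    l = len(bits)
    return min(
        sum(bits[i] != bits[(i + p) % l] for i in range(l))
        for p in range(1, l)
    )


def max_autodistance(l: int) -> int:
    """Largest autodistance among all 2**l words of length l (brute force)."""
    return max(autodistance(w) for w in product((0, 1), repeat=l))


for l in range(3, 11):
    # Theorem 1 guarantees that the first number never exceeds the second.
    print(l, max_autodistance(l), up(l))
```

The closed form agrees with the table row by row: for instance `up(4 * n + 3)` equals `2 * n + 2`.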

In order to prove the theorem, let us establish two lemmas. 

Lemma 1 (parity of $d(S)$) The autodistance of every word $S \in \{0,1\}^{2+}$ is even.

Proof: By Definition 1, for some $p \in \mathbb{N}$ we have $d(S) = d(S, \tau_p(S))$. It is therefore
sufficient to prove that the Hamming distance between a word $S \in \{0,1\}^{2+}$ and any of its
circular permutations is even.

Let $T$ be a circular permutation of $S$. We define, for $x, y \in \{0,1\}$, the four sets

$$A_{xy} = \{\, i \in [0\,..\,|S|) \mid S[i] = x \land T[i] = y \,\}$$

which trivially have the following properties:

$$|S|_1 = |A_{10}| + |A_{11}|$$
$$|T|_1 = |A_{01}| + |A_{11}|$$
$$d(S, T) = |A_{01}| + |A_{10}|$$

These equations, together with the fact that $|S|_1 = |T|_1$, imply

$$d(S, T) = 2\,|A_{01}|$$

so $d(S, T)$ is even. □

Lemma 2 (a weaker version of Theorem 1) For $l \ge 3$, the autodistance of every word
$S \in \{0,1\}^l$ is less than or equal to $\lceil l/2 \rceil$.

Proof: Let $S \in \{0,1\}^l$. We define for $i \in [0\,..\,l)$ and $x \in \{0,1\}$:

$$N_x[i] = |\{\, p \in [0\,..\,l) \mid \tau_p(S)[i] = x \,\}|$$

By definition of $\tau_p(S)$,

$$N_x[i] = |\{\, p \in [0\,..\,l) \mid S[(i + p) \bmod l] = x \,\}|$$

and, regardless of $i$,

$$N_x[i] = |S|_x \qquad (1)$$

Let us define the total autodistance of $S$, called $K$, as

$$K = \sum_{p=0}^{l-1} d(S, \tau_p(S)) \qquad (2)$$

By definition of $d(S, T)$, $K$ satisfies:

$$\begin{aligned}
K &= \sum_{p=0}^{l-1} |\{\, i \in [0\,..\,l) \mid S[i] \ne \tau_p(S)[i] \,\}| \\
  &= |\{\, (p, i) \in [0\,..\,l)^2 \mid S[i] \ne \tau_p(S)[i] \,\}| \\
  &= \sum_{i=0}^{l-1} |\{\, p \in [0\,..\,l) \mid S[i] \ne \tau_p(S)[i] \,\}| \\
  &= \sum_{\substack{i \in [0..l) \\ S[i] = 0}} N_1[i] \;+\; \sum_{\substack{i \in [0..l) \\ S[i] = 1}} N_0[i] \\
  &= \sum_{\substack{i \in [0..l) \\ S[i] = 0}} |S|_1 \;+\; \sum_{\substack{i \in [0..l) \\ S[i] = 1}} |S|_0 \qquad \text{(by (1))}
\end{aligned}$$

$$K = 2\,|S|_0\,|S|_1 \qquad (3)$$

The autodistance of $S$ is, by its definition, the minimum of the family $(d(S, \tau_p(S)))_{p \in [1..l)}$.
Let us define the average autodistance of $S$, called $M$, as the average of the same family:

$$M = \frac{\sum_{p=1}^{l-1} d(S, \tau_p(S))}{l - 1} \qquad (4)$$

This definition implies that $M \ge d(S)$.

Equations (2) and (4) and the fact that $d(S, \tau_0(S)) = 0$ lead to the following expression
of $M$:

$$M = \frac{K}{l - 1}$$

$$M = \frac{2\,|S|_0\,|S|_1}{l - 1} \qquad \text{(by (3))} \qquad (5)$$

If $l$ is even, $M$ is maximal for $|S|_0 = |S|_1 = l/2$, and we have

$$M \le \frac{2\,(l/2)(l/2)}{l - 1}$$

$$M \le \frac{l}{2\,(1 - 1/l)}$$

Since $l \ge 3$,

$$M < \frac{l}{2} + 1$$

Since $d(S) \le M$ and $d(S) \in \mathbb{Z}$,

$$d(S) \le \frac{l}{2}$$

and the lemma holds for $l$ even.

If $l$ is odd, $M$ is maximal for $|S|_0 = (l - 1)/2$ and $|S|_1 = (l + 1)/2$. We have therefore

$$M \le \frac{2\,(l/2 + 1/2)(l/2 - 1/2)}{l - 1}$$

$$M \le \frac{l + 1}{2} \qquad (6)$$

Then,

$$d(S) \le \lceil l/2 \rceil$$

and the lemma holds for $l$ odd. □
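
Identity (3), on which the whole proof rests, is easy to confirm numerically. The sketch below assumes words are lists of 0 and 1 and checks $K = 2\,|S|_0\,|S|_1$ on random words; the helper name `total_autodistance` and the sampling parameters are illustrative choices, not part of the article.

```python
import random


def total_autodistance(bits) -> int:
    """K: sum over all shifts p in [0..l) of d(S, tau_p(S)), as in equation (2)."""
    l = len(bits)
    return sum(
        sum(bits[i] != bits[(i + p) % l] for i in range(l))
        for p in range(l)
    )


rng = random.Random(0)  # fixed seed so the check is repeatable
for _ in range(500):
    l = rng.randint(3, 40)
    bits = [rng.randint(0, 1) for _ in range(l)]
    zeros, ones = bits.count(0), bits.count(1)
    # Equation (3): the total depends only on the numbers of zeros and ones.
    assert total_autodistance(bits) == 2 * zeros * ones

print("identity (3) holds on all sampled words")
```

Because $K$ depends only on $|S|_0$ and $|S|_1$, the average $M = K/(l-1)$ is fixed by the balance of the word, which is why balanced words maximize it.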



Proof of Theorem 1: Lemma 2 implies that, for $l \ge 3$, no word can have an autodistance
greater than the value $d(S)$ listed in the table below:

    l = |S|      bound on d(S)
    4n           2n
    4n + 1       2n + 1
    4n + 2       2n + 1
    4n + 3       2n + 2

Lemma 1 says that no word can have an autodistance of the form $2n + 1$, which allows us to
deduce the table in Theorem 1 from the one above. □

3.2 Non-existence of certain uniform sequences

Lemma 3 (domain of $d'(S)$) For any word $S \in \{0,1\}^l$, $l \in \mathbb{N}_{2+}$, the reverse autodistance of
$S$ is even and satisfies

$$d(S) \le d'(S) \le l \qquad (7)$$

Proof: Substituting $d'(S)$ for $d(S)$ in the proof of Lemma 1 gives the evenness of $d'(S)$.
Relation (7) results directly from the definitions of autodistance and reverse autodistance. □



Theorem 2 (nontrivial uniform sequences for $l - 1$ prime) Let $l \in \mathbb{N}_{2+}$ and let $l - 1$ be
prime. Then among the words $S \in \{0,1\}^l$, exactly those verifying one of the conditions

$$|S|_0 = 0 \qquad (8)$$
$$|S|_0 = 1 \qquad (9)$$
$$|S|_0 = l \qquad (10)$$
$$|S|_0 = l - 1 \qquad (11)$$

are uniform sequences.

Proof: The reader may easily verify the fact that each of the conditions (8)-(11) implies
that $S$ is a uniform sequence.

Supposing that $l - 1$ is prime and that $S \in \{0,1\}^l$ is a uniform sequence, let us prove that
one of relations (8)-(11) holds. From the definitions of autodistance and reverse autodistance,
we get

$$\forall\, p \in [1\,..\,l):\quad d(S) \le d(S, \tau_p(S)) \le d'(S)$$

which implies that $M$, the average autodistance of $S$ defined as in the proof of Lemma 2,
relation (4), satisfies

$$d(S) \le M \le d'(S)$$

Since $d(S) = d'(S)$, we successively get

$$M = d(S)$$
$$M \in 2\mathbb{N} \qquad \text{(from Lemma 1)}$$
$$\frac{2\,|S|_0\,|S|_1}{l - 1} \in 2\mathbb{N} \qquad \text{(from (5))}$$
$$|S|_0\,(l - |S|_0) \in (l - 1)\mathbb{N}$$
$$|S|_0 \in (l - 1)\mathbb{N} \;\text{ or }\; (l - |S|_0) \in (l - 1)\mathbb{N} \qquad \text{(since } l - 1 \text{ is prime)} \qquad (12)$$

Relation (12) implies that one of the conditions (8)-(11) holds. □

Theorem 3 (uniform optimal synchronizing sequences) Let $l \in \mathbb{N}_{2+}$. If one of the following
holds

i. $l = 4n$ where $n \in \mathbb{N}$ and $\sqrt{n} \notin \mathbb{N}$,

ii. $l = 4n + 1$ where $n \in \mathbb{N}$ and $\sqrt{8n + 1} \notin \mathbb{N}$,

iii. $l = 4n + 2$ where $n \in \mathbb{N}$ and $\sqrt{3n + 1} \notin \mathbb{N}$,

then no uniform sequence $S \in \{0,1\}^l$ will satisfy the equality $d(S) = \mathrm{up}(l)$.

Proof: Suppose that $S \in \{0,1\}^l$ is a uniform sequence with $d(S) = d'(S) = \mathrm{up}(l)$. Then,
reasoning as in the proof of Theorem 2, we can say that $M$, the average autodistance of $S$,
satisfies

$$M = d(S)$$

which, by (5), translates into

$$2\,|S|_0\,(l - |S|_0) = (l - 1)\,\mathrm{up}(l) \qquad (13)$$

If (i) holds, then $l = 4n$, $\mathrm{up}(l) = 2n$, and (13) becomes

$$|S|_0^2 - 4n\,|S|_0 + 4n^2 - n = 0$$

Solving this second degree equation in $|S|_0$, we deduce that (13) is equivalent to

$$|S|_0 = 2n + \sqrt{n} \quad\text{or}\quad |S|_0 = 2n - \sqrt{n}$$

which is impossible since $\sqrt{n} \notin \mathbb{N}$.

If (ii) holds, then (13) becomes

$$|S|_0^2 - (4n + 1)\,|S|_0 + 4n^2 = 0$$

$$|S|_0 = \tfrac{1}{2}\bigl(4n + 1 + \sqrt{8n + 1}\bigr) \quad\text{or}\quad |S|_0 = \tfrac{1}{2}\bigl(4n + 1 - \sqrt{8n + 1}\bigr) \qquad (14)$$

Recalling that the square root of a natural number is either natural or irrational, we
deduce that $\sqrt{8n + 1}$ is irrational. Therefore, the alternative (14) implies that $|S|_0$ is
irrational, which is impossible.

If (iii) holds, then (13) becomes

$$|S|_0^2 - 2(2n + 1)\,|S|_0 + (4n + 1)\,n = 0$$

$$|S|_0 = 2n + 1 + \sqrt{3n + 1} \quad\text{or}\quad |S|_0 = 2n + 1 - \sqrt{3n + 1} \qquad (15)$$

which is impossible since $\sqrt{3n + 1} \notin \mathbb{N}$. □
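
The three conditions of Theorem 3 are simple arithmetic tests on $l$, so the lengths they rule out can be listed directly. The sketch below is an illustration only; the function name `theorem3_excludes` is hypothetical, and lengths of the form $4n + 3$ are reported as not excluded simply because Theorem 3 says nothing about them.

```python
from math import isqrt


def is_square(m: int) -> bool:
    """True when m is a perfect square (its square root is a natural number)."""
    r = isqrt(m)
    return r * r == m


def theorem3_excludes(l: int) -> bool:
    """True when Theorem 3 rules out a uniform word of length l with d(S) = up(l)."""
    n, r = divmod(l, 4)
    if r == 0:                      # case (i):   l = 4n,     sqrt(n) not natural
        return not is_square(n)
    if r == 1:                      # case (ii):  l = 4n + 1, sqrt(8n + 1) not natural
        return not is_square(8 * n + 1)
    if r == 2:                      # case (iii): l = 4n + 2, sqrt(3n + 1) not natural
        return not is_square(3 * n + 1)
    return False                    # l = 4n + 3 is not covered by Theorem 3


# Lengths in [3..50] for which Theorem 3 leaves the question open.
print([l for l in range(3, 51) if not theorem3_excludes(l)])
```

A length that is not excluded is not thereby guaranteed to admit such a word: the theorem only gives a necessary condition.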






3.3 Uniformity of certain sequences 

Theorem 4 (certain sequences are uniform) For $l = 4n + 3$, $n \in \mathbb{N}$, every word from $\{0,1\}^l$
whose autodistance is equal to $\mathrm{up}(l)$ is a uniform optimal synchronizing sequence.

Theorem 5 below says that sequences satisfying the hypotheses of Theorem 4 exist for
$l = 2^n - 1$, $n \in \mathbb{N}_{2+}$. In Section 5.2 (Figure 2 and Table 1) examples of sequences are quoted
for $l = 3, 7, 11, 15, 19, 23, 31, 35$.

Proof of Theorem 4: Let $S$ satisfy the hypotheses of the theorem. Then $S$ is, by Theorem 1
and by the definition of $\mathrm{up}(l)$, an optimal synchronizing sequence.

Let us prove that $S$ is a uniform sequence. We use $M$, as defined by equation (4) in the proof
of Lemma 2. Since $l$ is odd, we can, as in the proof of Lemma 2, obtain inequality (6). This
inequality and the fact that $d(S) = \frac{l+1}{2}$ imply that $M \le d(S)$. Since $M$ is, by its definition,
greater than or equal to $d(S)$, we get

$$M = d(S)$$

The average and the minimum of the finite family of integers $(d(S, \tau_p(S)))_{p \in [1..l)}$ are then
equal. All the numbers in the family are therefore equal and $d'(S) = d(S)$. □

4 Maximal Period Linear Recurring Sequences 

Theorem 5 ($\mathrm{up}(l)$ is optimal for $l = 2^n - 1$) For every $l$ of the form $l = 2^n - 1$, $n \in \mathbb{N}_{2+}$,
there exists a word $S_n \in \{0,1\}^l$ verifying

$$d(S_n) = d'(S_n) = \mathrm{up}(l) \qquad (16)$$

Since this theorem is a straightforward corollary of known results, we will not quote the 
proof in its entirety. Instead, we only describe a way to construct the sequence S n . The 
proof that this construction is correct and that the resulting S n satisfies relation (16) is a direct 
consequence of well-known results from the theory of finite fields (see e.g. [4], paragraphs 2.11,
6.32, 6.33 and 7.44). The construction itself is discussed in detail by Sarwate and Pursley ([7], 
Section 3). 

Construction: Let GF2 denote the Galois field of order 2 (i.e. the field composed of
elements 0 and 1) and GF2[X] denote the ring of polynomials over GF2.

For every $n \in \mathbb{N}_{2+}$, there exists in GF2[X] at least one primitive polynomial of degree $n$
(see [4], 2.11). Let us choose one such polynomial and call it $P_n$; the coefficients of $P_n$ will
be called $p_0, \ldots, p_n$ (with $p_n = 1$):

$$P_n(X) = p_0 + p_1 X + \cdots + p_n X^n$$

$P_n$ can be used as the characteristic polynomial to build an infinite linear feedback sequence
of bits $S'_n$. To build $S'_n$, we arbitrarily choose its first $n$ bits $S'_n[0], \ldots, S'_n[n-1]$, with the
only restriction that these bits may not be all equal to 0 (this gives us $2^n - 1$ different choices
of $S'_n$). Then, we define the other bits of $S'_n$ by the recurrence formula

$$0 = p_0 S'_n[i] + p_1 S'_n[i+1] + \cdots + p_n S'_n[i+n] \qquad (\text{for any } i \in \mathbb{N}) \qquad (17)$$

which translates into

$$S'_n[i+n] = p_0 S'_n[i] + p_1 S'_n[i+1] + \cdots + p_{n-1} S'_n[i+n-1] \qquad (\text{for any } i \in \mathbb{N}) \qquad (18)$$

The sequence $S'_n$ is periodic and its least period is $l = 2^n - 1$ (see [4], 6.33). We define $S_n$ to
be the left factor of $S'_n$ of length $l$ (therefore $S_n$ represents one period of $S'_n$). $S_n$ satisfies (16)
(see [4], 7.44).
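
Recurrence (18) translates almost literally into code. The sketch below is a minimal illustration, assuming the caller supplies the coefficient list `[p_0, ..., p_n]` of a polynomial that is indeed primitive over GF2; the two polynomials used, $X^3 + X + 1$ and $X^4 + X + 1$, are standard primitive polynomials chosen here for illustration and are not taken from the article. The function names and the initial state are likewise arbitrary choices.

```python
def m_sequence(coeffs):
    """One period of the linear recurring sequence defined by recurrence (18).

    coeffs = [p_0, ..., p_n] with p_n = 1; the polynomial is assumed primitive.
    """
    n = len(coeffs) - 1
    length = 2 ** n - 1
    s = [1] + [0] * (n - 1)          # any initial state other than all zeros
    while len(s) < length:
        i = len(s) - n
        # S'[i + n] = p_0 S'[i] + ... + p_{n-1} S'[i + n - 1], computed in GF2
        s.append(sum(coeffs[k] * s[i + k] for k in range(n)) % 2)
    return s


def shift_distances(bits):
    """Hamming distances between a word and each of its non-identical rotations."""
    l = len(bits)
    return [
        sum(bits[i] != bits[(i + p) % l] for i in range(l))
        for p in range(1, l)
    ]


s3 = m_sequence([1, 1, 0, 1])        # X^3 + X + 1, length 7
s4 = m_sequence([1, 1, 0, 0, 1])     # X^4 + X + 1, length 15
print(s3, set(shift_distances(s3)))
print(s4, set(shift_distances(s4)))
```

For both words every non-trivial shift gives the same distance (4 for length 7, 8 for length 15), which is the uniformity expressed by relation (16).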

Consequences of the theorem: Theorem 5 implies that for all values $l$ of the form $2^n - 1$,
the upper bound $\mathrm{up}(l)$ is achieved by some word from $\{0,1\}^l$. For these values of $l$ the upper
bound $\mathrm{up}(l)$ can therefore not be improved.

The results presented in the remainder of this article imply that, in fact, the upper bound
$\mathrm{up}(l)$ is optimal or nearly optimal for any length $l$.

5 Example Double Synchronizing Sequences 

5.1 How the examples have been found 

Simulated annealing, the technique used here to find double synchronizing sequences, was 
first described by Kirkpatrick et al. [3]. Let us describe briefly both the technique and the way 
in which it has been adapted to our problem. 

Simulated annealing is an optimization algorithm. It provides approximate solutions to
difficult problems (i.e. to problems for which finding the global optimum would involve an
extremely long computing time). More precisely, for a set $X$ on which a function
$\mathcal{E} : X \to \mathbb{R}$, called energy, is defined, simulated annealing tries to find an element $x \in X$ such that
$\mathcal{E}(x)$ is as low as possible.

In our case, the algorithm is run separately for each value of $l$ and we have $X = \{0,1\}^l$. When
searching for synchronizing sequences, we try to maximize $d(x)$; therefore $\mathcal{E}(x) = -d(x)$.
When searching for double synchronizing sequences, we try both to maximize $d(x)$ and to
minimize $d'(x)$. In this case, the choice of $\mathcal{E}$ is not obvious; after experimentation, the author
chose $\mathcal{E}(x) = d'(x) - 3\,d(x)$, although various other formulas apparently lead to identical
results.

Simulated annealing requires that for every $x \in X$, a set of neighbors $\mathcal{N}(x)$ be defined.
Intuitively, $x$ and $y$ are neighbors (i.e. $y \in \mathcal{N}(x)$) if they are similar in a way implying that
$\mathcal{E}(x) \approx \mathcal{E}(y)$. In our case, we consider that two words from $\{0,1\}^l$ are neighbors if their
Hamming distance is equal to 0 or 1. Changing one digit of $x$ changes each distance
$d(x, \tau_p(x))$ by at most 2; for the two energy functions mentioned above, this
implies that if $y \in \mathcal{N}(x)$, then respectively $|\mathcal{E}(x) - \mathcal{E}(y)| \le 2$ or $|\mathcal{E}(x) - \mathcal{E}(y)| \le 8$.

The simulated annealing algorithm is a loop composed of a high number of similar steps.
In each step, the algorithm tries to update the current solution $x \in X$. To do so, it randomly
chooses a solution $y \in \mathcal{N}(x)$. Then, if $y$ is better than $x$ (i.e. $\mathcal{E}(y) < \mathcal{E}(x)$), $y$ replaces $x$
and becomes the current solution. Otherwise (i.e. if $\mathcal{E}(y) \ge \mathcal{E}(x)$) one of two possibilities is
randomly selected: either, with probability $p = e^{(\mathcal{E}(x) - \mathcal{E}(y))/\theta}$, $y$ replaces $x$ and becomes the
current solution or, with probability $1 - p$, $x$ remains the current solution and $y$ is discarded.

The current solution $x$ present after the last step is output by the algorithm to be considered
as its result.

The parameter $\theta$ is a positive real number, called temperature; it decreases slowly during
the computation from a problem-dependent initial value to zero. Note that for $\theta$ very high, the
algorithm reduces to randomly walking through the search space $X$, regardless of the energy
function (because for $\theta$ high, always $p \approx 1$); for $\theta \approx 0$, the algorithm descends quickly towards
a local minimum of $\mathcal{E}$. For intermediate values of $\theta$, the algorithm randomly walks through
$X$, visiting more frequently elements $x$ with $\mathcal{E}(x)$ low.
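
The loop described above can be sketched in a few lines. This is not the author's program: the linear cooling schedule, the initial temperature `theta0`, the number of steps and the fixed random seed are all assumptions made for illustration, and the neighbor is always drawn at Hamming distance exactly 1 (the article also allows distance 0). Only the energy $d'(x) - 3\,d(x)$ and the acceptance rule come from the text.

```python
import math
import random


def shift_distances(bits):
    """Hamming distances between a word and each of its non-identical rotations."""
    l = len(bits)
    return [
        sum(bits[i] != bits[(i + p) % l] for i in range(l))
        for p in range(1, l)
    ]


def energy(bits):
    """Energy d'(x) - 3 d(x) used for double synchronizing sequences."""
    ds = shift_distances(bits)
    return max(ds) - 3 * min(ds)


def anneal(l, steps=5000, theta0=4.0, seed=0):
    """Return the current solution after the last step, as in the text."""
    rng = random.Random(seed)
    x = [rng.randint(0, 1) for _ in range(l)]
    ex = energy(x)
    for step in range(steps):
        theta = theta0 * (1 - step / steps)   # assumed schedule: linear decrease
        y = x[:]
        y[rng.randrange(l)] ^= 1              # neighbor: flip one randomly chosen digit
        ey = energy(y)
        # Accept improvements always; otherwise with probability e^((E(x)-E(y))/theta).
        if ey <= ex or rng.random() < math.exp((ex - ey) / theta):
            x, ex = y, ey
    return x


word = anneal(15)
ds = shift_distances(word)
print("".join(map(str, word)), min(ds), max(ds))
```

Recomputing the energy from scratch costs quadratic time per step; since one flipped digit affects only two positions of each shifted comparison, an incremental update would be a natural optimization for longer words.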

5.2 What we can learn from the examples 

The curve on Fig. 1 (and its magnified version, Fig. 2) shows, for each $l \in [3\,..\,405]$, the
autodistance and the reverse autodistance of the best double synchronizing sequence found for
the length $l$ by simulated annealing. The autodistance can be compared to $\mathrm{up}(l)$, also shown
on the figures. Table 1 reproduces part of these results.

5.2.1 The autodistance

For $3 \le l \le 42$, the autodistance of the examples is, with the exceptions of $l = 27$ and $l = 39$,
equal to $\mathrm{up}(l)$. For the particular cases of $l = 27$ and $l = 39$, exhaustive searches showed that
there are no synchronizing sequences with autodistance equal to $\mathrm{up}(l)$ ¹; the examples found
for these two values of $l$ are therefore optimal.

¹ For $l = 39$, the exhaustive search was performed by Mark Shand [8] using a carefully optimized search
program.

We are thus certain that, for $l \le 42$ (as well as for $l = 45, 46, 49, 50, 54$, see Fig. 2),
the simulated annealing program actually found optimal synchronizing sequences. For these
values, with the exceptions of $l = 27$ and $l = 39$, the upper bound of Theorem 1 is exact.
For $l = 27$ and $l = 39$, the maximal autodistance is less than $\mathrm{up}(l)$, and Theorem 1 could be
improved to take this fact into account.

According to Theorem 5, for lengths of the form $l = 2^n - 1$, some sequences achieve the
upper bound $\mathrm{up}(l)$. Therefore, for $l = 63, 127, 255$, the simulated annealing program found

Figure 1: Autodistance and reverse autodistance of example sequences as a function of their
lengths $l$. The lower line shows the autodistance of the best double synchronizing sequence
found by simulated annealing for each length. The upper, dotted line shows the reverse
autodistance of the same sequences. The middle, perfectly regular line shows $\mathrm{up}(l)$.

[Plot not reproduced. Horizontal axis: length $l$, from 10 to 50; vertical axis: distance, from 10 to 30.]

Figure 2: A fragment of the curves from Fig. 1, magnified.

    l = |S|   d(S)   d'(S)   S
    3         2      2       100
    4         2      2       0100
    5         2      2       01000
    6         2      2       000100
    7         4      4       1110100
    8         4      6       11100010
    9         4      6       110000010
    10        4      6       0000010110
    11        6      6       10001001011
    12        6      8       111001100101
    13        6      6       1000000101001
    14        6      8       11100100010000
    15        8      8       000100110101111
    16        8      10      1101110000011010
    17        8      10      11001101101010001
    18        8      10      110010110010000101
    19        10     10      1001111010100001100
    20        10     12      01000011011011000101
    21        10     12      011110000100101110110
    22        10     12      0100001010001001111011
    23        12     12      00000101001100110101111
    24        12     14      100011110110110000010101
    25        12     14      1011000110000000101110100
    26        12     14      10010100111110001000100010
    27        12     14      110100010111001100000000010
    28        14     16      0111001111110100100110101000
    29        14     16      00000101100111111001010011101
    30        14     16      111001100101101010111000111111
    31        16     16      1111011010011000001110010001010
    32        16     18      00010001011001000111011010111100
    33        16     18      100100111000111011101000010000101
    34        16     18      1010001111011010010011001100000010
    35        18     18      00000111000101101100101011110110001
    36        18     20      100010011110111100001011010001011000
    37        18     20      0011011010111010001100001000110111101
    38        18     20      01010001000000011001111000110110100001
    39        18     20      010010110101110011100000011101000100010
    40        20     24      0001000011101000110100110011010110110111
    41        20     22      00011101011111000001001010000100110110001
    42        20     22      111111010000001000100110001010010010111000
    43        20     22      1110110001010111100100111101001110010111011
    44        20     22      11110110100111111100111110101010011001001110
    45        22     26      001000110001101000101110001011010011011111101
    46        22     26      1011010110111010010001000111110001110010010111
    47        22     26      01111010101000101101011000001100010011110011011
    48        22     26      011011000110001010101110010010111101000000011000
    49        24     28      0100001101011101111110110000011100110110000101010
    50        24     28      11000010110111001010011001101110101110000100000110

Table 1: Examples of synchronizing sequences.

only sub-optimal synchronizing sequences.

For $l = 43, 44, 48$, by systematically searching through a significant fraction of $\{0,1\}^l$, Mark
Shand [8] found words achieving $\mathrm{up}(l)$; the best examples found by simulated annealing for
these values of $l$ are therefore non-optimal.

For all values of $l$ not mentioned above, we do not know whether the synchronizing
sequences found using simulated annealing are optimal; we do not know, either, whether $\mathrm{up}(l)$
is the exact upper bound for those values. Unlike for $l < 44$, the exhaustive search, which
costs $O(2^l)$ in time, cannot be applied to answer these questions.

5.2.2 The reverse autodistance of optimal synchronizing sequences 

Lemma 3 and Theorem 3 imply that the examples found for $l \in \{3\,..\,15,\ 17\,..\,21,\ 23,\ 24,\ 26,$
$28\,..\,33,\ 35,\ 37,\ 42\}$ are double-optimal synchronizing sequences.

As indicated in Section 5.2.1 above, for $l = 27$ there are no words $S \in \{0,1\}^l$ with
$d(S) = \mathrm{up}(l)$; a computation analogous to those in the proof of Theorem 3 shows that there
is also no word of this length with $d(S) = d'(S) = \mathrm{up}(l) - 2$. Therefore, the corresponding
example is a double-optimal synchronizing sequence.

For $l = 16, 22, 25$, exhaustive searches showed that there is no word $S \in \{0,1\}^l$ satisfying
$d(S) = d'(S) = \mathrm{up}(l)$; the corresponding examples are therefore double-optimal synchronizing
sequences.

For $l \in \{34, 36, 38, 40, 41, 45, 46, 49, 50, 54\}$, the examples found are optimal
synchronizing sequences, but the author has not been able to establish whether they are
double-optimal.

6 Double Synchronizing Sequences of Length $l \to +\infty$

6.1 The result 

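Theorem 6 (double synchronizing sequences for large $l$) Let $\alpha \in \mathbb{R}$, $0 < \alpha < 1$. There
exists a function $\varepsilon : \mathbb{N}_{2+} \to \mathbb{R}_+$ such that $\lim_{+\infty} \varepsilon = 0$ and that for every $l \in \mathbb{N}$, $l \ge 3$, there are
at least $\alpha\, 2^l$ distinct words $S \in \{0,1\}^l$ satisfying

$$\mathrm{up}(l) - l\,\varepsilon(l) \;\le\; d(S) \;\le\; d'(S) \;\le\; \mathrm{up}(l) + l\,\varepsilon(l)$$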
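
Theorem 6 is asymptotic, but its flavor can be observed empirically by sampling random words and measuring how far $d(S)$ and $d'(S)$ lie from $\mathrm{up}(l)$, relative to $l$. The sketch below is purely illustrative: the lengths, the number of trials and the seed are arbitrary assumptions, and the printed gaps are observations on a small sample, not the function $\varepsilon$ of the theorem.

```python
import random


def up(l: int) -> int:
    """Upper bound of Theorem 1, in the closed form 2 * floor((l + 1) / 4)."""
    return 2 * ((l + 1) // 4)


def d_and_dprime(bits):
    """Return (d(S), d'(S)); shift 0 is skipped because it contributes distance 0."""
    l = len(bits)
    ds = [
        sum(bits[i] != bits[(i + p) % l] for i in range(l))
        for p in range(1, l)
    ]
    return min(ds), max(ds)


def sample_gaps(l, trials=30, seed=0):
    """Relative gaps ((up - d) / l, (d' - up) / l) for uniformly random words."""
    rng = random.Random(seed)
    gaps = []
    for _ in range(trials):
        bits = [rng.randint(0, 1) for _ in range(l)]
        d, dp = d_and_dprime(bits)
        gaps.append(((up(l) - d) / l, (dp - up(l)) / l))
    return gaps


for l in (50, 100, 200):
    gaps = sample_gaps(l)
    # Largest relative gaps observed in the sample, below and above up(l).
    print(l, max(g[0] for g in gaps), max(g[1] for g in gaps))
```

The first gap is never negative, by Theorem 1; the second can be negative for strongly unbalanced words, whose distances all stay below $\mathrm{up}(l)$.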

6.2 How the proof is organized 

The proof of Theorem 6 is long. Let us summarize it here. 

Section 6.3 states two capital lemmas from which the theorem directly results. 

Section 6.4 defines several notational conventions. 

Section 6.5 contains auxiliary lemmas, which recall generally known mathematical facts.

Sections 6.6-6.9 contain the proof of the first capital lemma.

In Section 6.6, we choose a function $\varepsilon$ which, as we will prove, satisfies both capital lemmas
(and thus the theorem). We then define the set $E \subset \{0,1\}^l$ of words whose autodistance is
less than $\mathrm{up}(l) - \varepsilon(l)\,l$, and we represent it as the union of a family of sets called $E_{vp}$.

Then, in Sections 6.7 and 6.8, we establish intermediate results which will enable us to
estimate the cardinals of the sets $E_{vp}$. Finally, in Section 6.9, we use these results to prove
that $|E| \le \frac{1-\alpha}{2}\, 2^l$, from which the first capital lemma follows.

In Section 6.10, rather than fully describing the proof of the second capital lemma, we 
simply indicate in which ways it differs from the proof of the first capital lemma. 

6.3 The two capital lemmas 

Theorem 6 follows in a straightforward way from the two following lemmas. 

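Capital Lemma 1 (autodistance for high $l$) Let $\alpha \in \mathbb{R}$, $0 < \alpha < 1$. There exists a function
$\varepsilon : \mathbb{N}_{2+} \to \mathbb{R}_+$ such that $\lim_{+\infty} \varepsilon = 0$ and for every $l \in \mathbb{N}$, $l \ge 3$, there are at most $\frac{1-\alpha}{2}\, 2^l$
distinct words $S \in \{0,1\}^l$ such that

$$d(S) < \mathrm{up}(l) - l\,\varepsilon(l)$$

Capital Lemma 2 (reverse autodistance for high $l$) Let $\alpha \in \mathbb{R}$, $0 < \alpha < 1$. There exists a
function $\varepsilon : \mathbb{N}_{2+} \to \mathbb{R}_+$ such that $\lim_{+\infty} \varepsilon = 0$ and for every $l \in \mathbb{N}$, $l \ge 3$, there are at most
$\frac{1-\alpha}{2}\, 2^l$ distinct words $S \in \{0,1\}^l$ such that

$$\mathrm{up}(l) + l\,\varepsilon(l) < d'(S)$$

6.4 Conventions

We make, for the whole proof, the following assumptions about the numbers $l$, $p$, $a$, $b$ and $\mu$
and about the sets $D$ and $P$:

    $l \in \mathbb{N}_{2+}$,      $3 \le l$
    $p \in \mathbb{Z}$,           $1 \le p \le l/2$
    $a \in \mathbb{N}$,           $1 \le a$
    $b \in \mathbb{N}$,           $2 \le b$
    $\mu \in \mathbb{R}$,         $0 < \mu < 1/2$
    $D \subset [0\,..\,l)$
    $P \subset \mathbb{Z}$,       $P$ is a finite set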
P is a finite set 

These assumptions are valid in lemmas and auxiliary definitions which are part of the proof. They will not be recalled there. For instance, the following 

Example Lemma 1 For every $\mu \in \mathbb{R}$ such that $0 < \mu < 1/2$ and for every $n \in \mathbb{Z}$, $\mu \ne n$. 

will be abbreviated to 

Example Lemma 2 For every $n \in \mathbb{Z}$, $\mu \ne n$. 

6.5 Auxiliary lemmas 

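Lemma 4 (approximation of $\sum \binom{n}{i}$) For every $n, d \in \mathbb{N}$, 

$$d \le (1/2 - \mu)\,n - 1 \quad \text{implies} \quad \sum_{0 \le i \le d} \binom{n}{i} \le n\,2^n e^{-\mu^3 n}$$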

Proof outline: Let us define <? = [(1/2 - /x/2)nj. Using the well-known equality = 
, we can then state the following: 

V(re [d..q)) (?) < 

(3) * 

(3) <- 
(3) * 

□ 
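
As a numerical sanity check, the sketch below tests one plausible reading of Lemma 4, namely that $d \le (1/2 - \mu)n - 1$ implies $\sum_{0 \le i \le d} \binom{n}{i} \le n\,2^n e^{-\mu^3 n}$. The exact form of the inequality is an assumption (it is inferred from the way the lemma is used in (35) and in the proof of Lemma 11), and the function names are illustrative only.

```python
from math import comb, exp, floor

def lemma4_sides(n, d, mu):
    """Return (lhs, rhs) of the assumed inequality of Lemma 4."""
    lhs = sum(comb(n, i) for i in range(d + 1))   # sum of C(n, i) for 0 <= i <= d
    rhs = n * 2**n * exp(-mu**3 * n)              # n * 2^n * e^(-mu^3 n)
    return lhs, rhs

def check_lemma4(n_max=60, mus=(0.05, 0.1, 0.25, 0.4)):
    """Check the assumed inequality for all admissible d, for small n."""
    for n in range(1, n_max + 1):
        for mu in mus:
            d_max = floor((0.5 - mu) * n - 1)     # largest d allowed by the hypothesis
            for d in range(0, d_max + 1):
                lhs, rhs = lemma4_sides(n, d, mu)
                if lhs > rhs:
                    return False
    return True
```

Such a check does not replace the proof; it only confirms that the assumed statement is not contradicted for small values of $n$.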

Auxiliary Definition 8 (families $\mathcal{F}_i$) For $i \in [0 .. l \sqcap p)$ and $x \in [0 .. \frac{l}{l \sqcap p})$, we define the numbers 

$$F_{i,x} = (xp + i) \bmod l$$

which form the families 

$$\mathcal{F}_i = (F_{i,x})_{0 \le x < \frac{l}{l \sqcap p}}$$

The numbers $F_{i,x}$ and the families $\mathcal{F}_i$ depend on the numbers $l$ and $p$ but, for simplicity, $l$ and $p$ do not appear as indices in their notation. 

Lemma 5 (fundamental property of $\mathcal{F}_i$) For every $i \in [0 .. l \sqcap p)$, the family $\mathcal{F}_i$ contains exactly once every element of the set $A_i = ((l \sqcap p)\mathbb{Z} + i) \cap [0 .. l)$ and contains only elements of this set. 

Proof outline: We call $\mathrm{Im}(\mathcal{F}_i)$ the image of the family $\mathcal{F}_i$, namely 

$$\mathrm{Im}(\mathcal{F}_i) = \{\, F_{i,x} \mid 0 \le x < \tfrac{l}{l \sqcap p} \,\}$$

For every $x, y \in [0 .. \frac{l}{l \sqcap p})$, the equation $F_{i,x} = F_{i,y}$ is equivalent to 

$$\exists (k \in \mathbb{Z}),\ (x - y)\,\frac{p}{l \sqcap p} = k\,\frac{l}{l \sqcap p}$$

which, thanks to the Gauss theorem [6], implies $x - y \in \frac{l}{l \sqcap p}\mathbb{Z}$. Since $-\frac{l}{l \sqcap p} < x - y < \frac{l}{l \sqcap p}$, we get $x = y$. All the elements of the family $\mathcal{F}_i$ are therefore distinct and the family contains every element of $A_i$ at most once. 

Since all the elements of $\mathcal{F}_i$ are distinct, the set $\mathrm{Im}(\mathcal{F}_i)$ contains $\frac{l}{l \sqcap p}$ elements; $A_i$ and $\mathrm{Im}(\mathcal{F}_i)$ have therefore the same number of elements. Since, as the reader may easily verify, $\mathrm{Im}(\mathcal{F}_i) \subset A_i$, we get $\mathrm{Im}(\mathcal{F}_i) = A_i$. The family $\mathcal{F}_i$ then contains each element of $A_i$ at least once and contains no elements from outside $A_i$. □ 
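
The following minimal sketch illustrates Lemma 5 on small values. It assumes that the operator written between $l$ and $p$ denotes the greatest common divisor, which is consistent with the statement of the lemma; the function names are hypothetical.

```python
from math import gcd

def family(l, p, i):
    """Family F_i: the numbers (x*p + i) mod l for 0 <= x < l / gcd(l, p)."""
    g = gcd(l, p)
    return [(x * p + i) % l for x in range(l // g)]

def residue_class(l, p, i):
    """Set A_i: elements of [0..l) congruent to i modulo gcd(l, p), sorted."""
    g = gcd(l, p)
    return list(range(i, l, g))

def lemma5_holds(l, p):
    """True if, for every i, F_i lists each element of A_i exactly once."""
    g = gcd(l, p)
    return all(sorted(family(l, p, i)) == residue_class(l, p, i)
               for i in range(g))
```

For instance, with $l = 12$, $p = 8$ and $i = 1$, the family is (1, 9, 5), which enumerates the class {1, 5, 9} without repetition.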

Lemma 6 (parity of the cardinal) If $A$ and $B$ are finite sets, $|A \triangle B|$ has the same parity as $|A| + |B|$. In other words, 

$$|A \triangle B| \equiv |A| + |B| \pmod 2$$

The proof is left to the reader. 
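
One possible argument, given here only as a sketch, splits each set into the part it shares with the other and the part it does not:

```latex
\begin{align*}
|A| + |B| &= \bigl(|A \setminus B| + |A \cap B|\bigr) + \bigl(|B \setminus A| + |A \cap B|\bigr) \\
          &= |A \triangle B| + 2\,|A \cap B| \\
          &\equiv |A \triangle B| \pmod 2
\end{align*}
```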

6.6 The sets $E_{p,D}$ 

Let $\alpha$ be defined as in Capital Lemma 1. We define then 

^ = (hT7 + 7 lnnn T^j 

e'(l) = ,,'(/) + ^ + j 



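$$\varepsilon(l) = \begin{cases} \varepsilon'(l) & \text{if } \varepsilon'(l) < 1/2 \text{ and } \mu'(l) < 1/2 \\ 1 & \text{otherwise} \end{cases}$$

The functions $\mu'$, $\varepsilon'$ and $\varepsilon$ are then strictly positive, and satisfy 

$$\lim_{+\infty} \mu' = 0, \qquad \lim_{+\infty} \varepsilon' = 0, \qquad \lim_{+\infty} \varepsilon = 0$$

(the easy, computational proofs of these facts are not reproduced here). 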

To prove Capital Lemma 1, it is now sufficient to establish, for every $l$, the property that there are at most $\frac{1-\alpha}{2}\,2^l$ distinct words $S \in \{0,1\}^l$ such that $d(S) < \mathrm{up}(l) - l\,\varepsilon(l)$. 

For $l$ such that $\varepsilon'(l) \ge 1/2$ or $\mu'(l) \ge 1/2$, we have $\varepsilon(l) = 1$ and the property trivially holds. We suppose therefore, for the rest of the proof, that $\varepsilon'(l) < 1/2$ and that $\mu'(l) < 1/2$, and we establish the property in this case. 

Define 

$$\delta = \mathrm{up}(l) - l\,\varepsilon(l) \tag{19}$$
$$E = \{\, S \in \{0,1\}^l \mid d(S) < \delta \,\} \tag{20}$$

The property to be proven can then be expressed by the relation 

$$|E| \le \frac{1-\alpha}{2}\,2^l \tag{21}$$

By Definition 1, equation (20) can be rewritten as 

$$E = \{\, S \in \{0,1\}^l \mid \exists (q \in [1 .. l))\ d(S, \tau_q(S)) < \delta \,\} \tag{22}$$

From the definition of the Hamming distance, it is easy to show that for every $q \in \mathbb{Z}$ and every $S \in \{0,1\}^l$, 

$$d(S, \tau_q(S)) = d(S, \tau_{-q}(S))$$

and (22) is equivalent to 

$$E = \{\, S \in \{0,1\}^l \mid \exists (p \in [1 .. \lfloor l/2 \rfloor])\ d(S, \tau_p(S)) < \delta \,\} \tag{23}$$

We then define 

$$E_p = \{\, S \in \{0,1\}^l \mid d(S, \tau_p(S)) < \delta \,\} \tag{24}$$

Relation (23) can then be rewritten 

$$E = \bigcup_{p=1}^{\lfloor l/2 \rfloor} E_p \tag{25}$$

Let us define, for $S \in \{0,1\}^l$, the set of differences $D_{S,p}$: 

$$D_{S,p} = \{\, i \in [0 .. l) \mid S[i] \ne \tau_p(S)[i] \,\} \tag{26}$$
$$D_{S,p} = \{\, i \in [0 .. l) \mid S[i] \ne S[(i + p) \bmod l] \,\} \tag{27}$$

and, for any $D$, let 

$$E_{p,D} = \{\, S \in \{0,1\}^l \mid D_{S,p} = D \,\} \tag{28}$$

Then (24) may be rewritten as 

$$E_p = \bigcup_{|D| < \delta} E_{p,D} \tag{29}$$

From equations (25) and (29), we can deduce 

$$|E| \le \sum_{p=1}^{\lfloor l/2 \rfloor} \sum_{|D| < \delta} |E_{p,D}| \tag{30}$$

The rest of this proof consists in bounding the number of terms in this sum and in estimating $|E_{p,D}|$ as a function of $l$, $p$ and $D$. 
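
The objects of this section can be made concrete with a short sketch. It assumes, following relation (27), that the circular permutation by $p$ sends $S$ to the word whose $i$-th bit is $S[(i + p) \bmod l]$; words are represented as tuples of bits and all function names are illustrative.

```python
from itertools import product

def rotate(S, p):
    """Circular permutation: bit i of the result is S[(i + p) mod l]."""
    l = len(S)
    return tuple(S[(i + p) % l] for i in range(l))

def hamming(S, T):
    """Hamming distance between two words of equal length."""
    return sum(a != b for a, b in zip(S, T))

def difference_set(S, p):
    """Set of differences: indices i with S[i] != S[(i + p) mod l], as in (27)."""
    l = len(S)
    return frozenset(i for i in range(l) if S[i] != S[(i + p) % l])

def autodistance(S):
    """Lowest distance between S and its non-trivial circular permutations."""
    return min(hamming(S, rotate(S, q)) for q in range(1, len(S)))

def autodistance_half(S):
    """Same quantity, using only shifts 1..floor(l/2) as in (23)."""
    return min(len(difference_set(S, p)) for p in range(1, len(S) // 2 + 1))

def check_half_range(l):
    """Check, for every word of length l, that both computations agree."""
    return all(autodistance(S) == autodistance_half(S)
               for S in product((0, 1), repeat=l))
```

The agreement of the two functions reflects the passage from (22) to (23): shifting by $q$ and by $-q$ gives the same distance, so only shifts up to $\lfloor l/2 \rfloor$ need to be examined.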

6.7 More auxiliary lemmas 

Auxiliary Definition 9 (functions $\varphi[i]$ and expression $f(D,i,j)$) For $i \in \mathbb{N}$, let us define the functions $\varphi, \varphi[i] : \{0,1\} \to \{0,1\}$ 

$$\varphi(x) = 1 - x$$
$$\varphi[0](x) = x$$
$$\varphi[i+1](x) = \varphi[i] \circ \varphi(x)$$

For $i \in [0 .. l \sqcap p)$ and $j \in [0 .. \frac{l}{l \sqcap p}]$, we define 

$$f(D,i,j) = \left| (F_{i,x})_{0 \le x < j} \right|_D$$

The expression $f(D,i,j)$ depends on $l$ and $p$, which, for simplicity, do not appear there as indices. 



Lemma 7 (relation between $S[i]$, $S[j]$ and $D_{S,p}$) Let $S \in E_{p,D}$. Then, for $0 \le i < l \sqcap p$ and $0 \le j \le \frac{l}{l \sqcap p}$, we have 

$$S[(i + pj) \bmod l] = \varphi[f(D,i,j)](S[i])$$

Proof: First, observe that for $n$ even, $\varphi[n](x) = x$ and for $n$ odd, $\varphi[n](x) = 1 - x$. 

We will prove the lemma by induction on $j$; the verification that the lemma holds for $j = 0$ is left to the reader. 

Let us assume the lemma true for $j$ (with $0 \le j < \frac{l}{l \sqcap p}$) and prove it for $j + 1$. Under the lemma's hypotheses, the fact that $S \in E_{p,D}$ (which implies $D = D_{S,p}$) and relation (27) let us state: 

if $(i + pj) \bmod l \in D$, then $S[(i + p(j+1)) \bmod l] = 1 - S[(i + pj) \bmod l]$; 
otherwise, $S[(i + p(j+1)) \bmod l] = S[(i + pj) \bmod l]$ 

which may be expressed as follows: 

$$
\begin{aligned}
S[(i + p(j+1)) \bmod l] &= \varphi[\,|D \cap \{(i + pj) \bmod l\}|\,](S[(i + pj) \bmod l]) \\
&= \varphi[f(D,i,j+1) - f(D,i,j)](S[(i + pj) \bmod l]) \\
&= \varphi[f(D,i,j+1) - f(D,i,j)](\varphi[f(D,i,j)](S[i])) \\
S[(i + p(j+1)) \bmod l] &= \varphi[f(D,i,j+1)](S[i])
\end{aligned}
$$

□ 
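
Lemma 7 can be checked exhaustively on short words. The sketch below assumes that $f(D,i,j)$ counts the indices $x \in [0 .. j)$ for which $(xp + i) \bmod l$ belongs to $D$, and that the operator between $l$ and $p$ is the greatest common divisor; both readings are assumptions drawn from the proof above, and the names are hypothetical.

```python
from itertools import product
from math import gcd

def phi_iter(n, x):
    """phi[n](x): n-fold application of x -> 1 - x, computed via the parity of n."""
    return x if n % 2 == 0 else 1 - x

def f(D, l, p, i, j):
    """Number of x in [0..j) such that (x*p + i) mod l belongs to D."""
    return sum(((x * p + i) % l) in D for x in range(j))

def difference_set(S, p):
    """Indices i with S[i] != S[(i + p) mod l], as in relation (27)."""
    l = len(S)
    return frozenset(i for i in range(l) if S[i] != S[(i + p) % l])

def lemma7_holds(S, p):
    """Check S[(i + p*j) mod l] == phi[f(D,i,j)](S[i]) with D = D_{S,p}."""
    l = len(S)
    g = gcd(l, p)
    D = difference_set(S, p)
    return all(S[(i + p * j) % l] == phi_iter(f(D, l, p, i, j), S[i])
               for i in range(g) for j in range(l // g + 1))

def check_all(l):
    """Check Lemma 7 for every word of length l and every p in [1..l/2]."""
    return all(lemma7_holds(S, p)
               for S in product((0, 1), repeat=l)
               for p in range(1, l // 2 + 1))
```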



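Lemma 8 (some $E_{p,D}$ are empty) If, for some $i \in [0 .. l \sqcap p)$, the number $|((l \sqcap p)\mathbb{Z} + i) \cap D|$ is odd, then $E_{p,D} = \emptyset$. 

Proof: Let $i \in [0 .. l \sqcap p)$ and let $|((l \sqcap p)\mathbb{Z} + i) \cap D|$ be odd. By Lemma 5, we get 

$$
\begin{aligned}
|\mathcal{F}_i|_D &= |(((l \sqcap p)\mathbb{Z} + i) \cap [0 .. l)) \cap D| \\
&= |((l \sqcap p)\mathbb{Z} + i) \cap D|
\end{aligned}
$$

$|\mathcal{F}_i|_D$ is then odd. Applying Lemma 7, for every $S \in E_{p,D}$ we then get 

$$
\begin{aligned}
S[i] &= S\left[\left(i + p\,\tfrac{l}{l \sqcap p}\right) \bmod l\right] \\
S[i] &= \varphi\left[f\left(D, i, \tfrac{l}{l \sqcap p}\right)\right](S[i]) \\
S[i] &= \varphi\left[\,|\mathcal{F}_i|_D\,\right](S[i]) \\
S[i] &= 1 - S[i] \quad (\text{since } |\mathcal{F}_i|_D \text{ is odd})
\end{aligned}
$$

which is impossible. Therefore, $S \in E_{p,D}$ is true for no $S$ and $E_{p,D} = \emptyset$. □ 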
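
The contrapositive of Lemma 8 is easy to test by brute force: every difference set that actually occurs must meet each residue class modulo $l \sqcap p$ in an even number of points. The sketch below assumes that $l \sqcap p$ is the greatest common divisor; the helper names are illustrative.

```python
from collections import Counter
from itertools import product
from math import gcd

def difference_set(S, p):
    """Indices i with S[i] != S[(i + p) mod l], as in relation (27)."""
    l = len(S)
    return frozenset(i for i in range(l) if S[i] != S[(i + p) % l])

def nonempty_classes(l, p):
    """Map each D for which E_{p,D} is non-empty to the cardinal of E_{p,D}."""
    return Counter(difference_set(S, p) for S in product((0, 1), repeat=l))

def all_residue_counts_even(D, l, p):
    """True if D meets every class i + gcd(l, p)Z in an even number of points."""
    g = gcd(l, p)
    return all(sum(1 for k in D if k % g == i) % 2 == 0 for i in range(g))

def lemma8_holds(l, p):
    """Every D with E_{p,D} non-empty must pass the parity test."""
    return all(all_residue_counts_even(D, l, p) for D in nonempty_classes(l, p))
```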

Lemma 9 ($E_{p,D}$ has at most $2^{l \sqcap p}$ members) For every $S' \in \{0,1\}^{l \sqcap p}$, there exists at most one $S$ such that $S \in E_{p,D}$ and the leftmost factor of $S$ of length $l \sqcap p$ is equal to $S'$. 

Proof: Let $S \in E_{p,D}$ and $k \in [0 .. l)$. Let $i$ be the remainder in the division of $k$ by $l \sqcap p$. Since $0 \le i < l \sqcap p$ and $k \in ((l \sqcap p)\mathbb{Z} + i) \cap [0 .. l)$, Lemma 5 implies that for some $j \in [0 .. \frac{l}{l \sqcap p})$, we have $k = (i + pj) \bmod l$. We can then apply Lemma 7 to get: 

$$S[k] = \varphi[f(D,i,j)](S[i])$$

This formula shows that every bit in $S$ can be determined as a function of $l$, $p$, $D$ and one of the $l \sqcap p$ leftmost bits of $S$. Therefore, for any given values of $l$, $p$ and $D$, the left factor of $S$ of length $l \sqcap p$ uniquely determines $S$. □ 
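
The proof of Lemma 9 is constructive, and the sketch below turns it into a procedure that rebuilds a word from its leftmost $l \sqcap p$ bits and its set of differences. It is an illustration under the assumption that $l \sqcap p$ is the greatest common divisor; the function `reconstruct` is a hypothetical helper, not something defined in the paper.

```python
from itertools import product
from math import gcd

def difference_set(S, p):
    """Indices i with S[i] != S[(i + p) mod l], as in relation (27)."""
    l = len(S)
    return frozenset(i for i in range(l) if S[i] != S[(i + p) % l])

def reconstruct(l, p, D, prefix):
    """Return the only word of E_{p,D} starting with prefix, or None if none exists."""
    g = gcd(l, p)
    S = [None] * l
    for i in range(g):
        bit, k = prefix[i], i
        for _ in range(l // g):          # walk the cycle i, i+p, i+2p, ... (mod l)
            S[k] = bit
            if k in D:                   # a difference at k flips the next bit
                bit = 1 - bit
            k = (k + p) % l
        if bit != prefix[i]:             # odd number of flips on the cycle (Lemma 8)
            return None
    return tuple(S)

def check_reconstruction(l):
    """Every word is recovered from its prefix and its difference set."""
    return all(reconstruct(l, p, difference_set(S, p), S[:gcd(l, p)]) == S
               for S in product((0, 1), repeat=l)
               for p in range(1, l // 2 + 1))
```

Since each of the $2^{l \sqcap p}$ possible prefixes yields at most one word, the bound of the lemma follows.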

6.8 The sets $\mathcal{D}_{d,p}$ 

For any $d \in \mathbb{N}$, let us define 

$$\mathcal{D}_{d,p} = \{\, D \mid |D| < d \wedge E_{p,D} \ne \emptyset \,\} \tag{31}$$

(the set $\mathcal{D}_{d,p}$ depends on $l$, but for simplicity $l$ will not appear as an index in its notation). 

We can rewrite equation (30) as follows: 

$$|E| \le \sum_{p=1}^{\lfloor l/2 \rfloor} \sum_{D \in \mathcal{D}_{\delta,p}} |E_{p,D}| \tag{32}$$

By Lemma 9, for every $D \in \mathcal{D}_{\delta,p}$, $|E_{p,D}| \le 2^{l \sqcap p}$ and equation (32) implies 

$$|E| \le \sum_{p=1}^{\lfloor l/2 \rfloor} |\mathcal{D}_{\delta,p}|\, 2^{l \sqcap p} \tag{33}$$
Let us find two different (and both useful) upper bounds on $|\mathcal{D}_{\delta,p}|$. 

Since $\mathcal{D}_{\delta,p}$ is only composed of subsets of $[0 .. l)$ containing less than $\delta$ elements, we get 

$$|\mathcal{D}_{\delta,p}| \le \sum_{0 \le x < \delta} \binom{l}{x} \tag{34}$$

and we can easily verify that the hypotheses of Lemma 4 hold (for $\mu = \mu'(l)$); in this way we get the first upper bound on $|\mathcal{D}_{\delta,p}|$ 

$$|\mathcal{D}_{\delta,p}| \le l\, e^{-\mu'(l)^3 l}\, 2^l \tag{35}$$

Let us compute the second upper bound on $|\mathcal{D}_{\delta,p}|$. To simplify notation, we define the two intervals $I$ and $J$: 

$$I = [0 .. a(b-1))$$
$$J = [a(b-1) .. ab)$$

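Auxiliary Definition 10 (sets $\mathcal{D}'_{d,a,b,P}$) For every $d \in \mathbb{N}$, let $\mathcal{D}'_{d,a,b,P}$ denote the set of sets $D' \subset [0 .. ab)$ such that $|D'| < d$ and, for every $i \in [0 .. a)$, $|D' \cap (a\mathbb{Z} + i)| + |P \cap (a\mathbb{Z} + i)|$ is even. 

Lemma 10 For every $d \in \mathbb{N}$, 

$$|\mathcal{D}'_{d,a,b,P}| \le 2^{a(b-1)} \tag{36}$$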



Proof: Since the set $I$ has $a(b-1)$ elements, there are at most $2^{a(b-1)}$ possible sets of the form $D' \cap I$. To prove the lemma, it will therefore suffice to establish that for fixed $d$, $a$, $b$ and $P$, and under the condition that $D' \in \mathcal{D}'_{d,a,b,P}$, the set $D' \cap I$ uniquely determines $D'$. 

For every $i$, $D' \cap (a\mathbb{Z} + i)$ is the disjoint union of $D' \cap I \cap (a\mathbb{Z} + i)$ and $D' \cap J \cap (a\mathbb{Z} + i)$, therefore 

$$|D' \cap (a\mathbb{Z} + i)| = |D' \cap I \cap (a\mathbb{Z} + i)| + |D' \cap J \cap (a\mathbb{Z} + i)|$$

The number 

$$|D' \cap I \cap (a\mathbb{Z} + i)| + |D' \cap J \cap (a\mathbb{Z} + i)| + |P \cap (a\mathbb{Z} + i)|$$

is therefore even. The parity of $|D' \cap J \cap (a\mathbb{Z} + i)|$ is hence determined by $D' \cap I$. 






On the other hand, we have 

$$J \cap (a\mathbb{Z} + i) = \{ i + a(b-1) \}$$

Therefore, 

$|D' \cap J \cap (a\mathbb{Z} + i)|$ even implies $D' \cap J \cap (a\mathbb{Z} + i) = \emptyset$ 
$|D' \cap J \cap (a\mathbb{Z} + i)|$ odd implies $D' \cap J \cap (a\mathbb{Z} + i) = \{ i + a(b-1) \}$ 

In this way, $D' \cap I$ uniquely determines $D' \cap J \cap (a\mathbb{Z} + i)$ for every $i$. The (easy to verify) equality 

$$D' = (D' \cap I) \cup \bigcup_{i=0}^{a-1} \left( D' \cap J \cap (a\mathbb{Z} + i) \right)$$

then implies that $D' \cap I$ uniquely determines $D'$. □ 
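
The argument of Lemma 10 can be followed on a small instance. The sketch below enumerates the sets of Auxiliary Definition 10 by brute force and rebuilds each of them from its intersection with the interval $[0 .. a(b-1))$. It assumes the strict condition $|D'| < d$; the function names are hypothetical.

```python
from itertools import combinations

def d_prime_sets(d, a, b, P):
    """All D' in [0..ab) with |D'| < d and |D' ∩ (aZ+i)| + |P ∩ (aZ+i)| even for each i."""
    p_parity = [sum(1 for x in P if x % a == i) % 2 for i in range(a)]
    result = []
    for size in range(min(d, a * b + 1)):
        for combo in combinations(range(a * b), size):
            counts = [0] * a
            for x in combo:
                counts[x % a] += 1
            if all((counts[i] + p_parity[i]) % 2 == 0 for i in range(a)):
                result.append(frozenset(combo))
    return result

def extend_from_I(D_I, a, b, P):
    """Rebuild D' from D' ∩ I: add i + a(b-1) exactly when the parity requires it."""
    D = set(D_I)
    for i in range(a):
        parity = (sum(1 for x in D_I if x % a == i)
                  + sum(1 for x in P if x % a == i)) % 2
        if parity == 1:
            D.add(i + a * (b - 1))
    return frozenset(D)

def lemma10_holds(d, a, b, P):
    """Check the bound 2^(a(b-1)) and the reconstruction from D' ∩ I."""
    sets = d_prime_sets(d, a, b, P)
    I = frozenset(range(a * (b - 1)))
    return (len(sets) <= 2 ** (a * (b - 1))
            and all(extend_from_I(Dp & I, a, b, P) == Dp for Dp in sets))
```

With $a = 2$, $b = 3$ and $P = \{0, 5\}$, lifting the size restriction (any $d > ab$) gives exactly $2^{a(b-1)} = 16$ sets, so the bound of the lemma is reached in that case.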



Lemma 11 For every $d$ such that $0 \le d \le (1/2 - \mu)ab - b$, 

$$|\mathcal{D}'_{d,a,b,P}| \le a\, e^{-\mu^3 a}\, 2^{ab+b-a} \tag{37}$$



Proof: For every value of $a$, we will prove the lemma by induction on $b$. 

First, we need to verify the lemma for $b = 2$. This verification, when fully described, is extremely long. For this reason, we will omit here numerous computational details. 

For any fixed $d$, $a$ and $P$ satisfying the lemma's hypotheses and for $b = 2$, we consider $D'$ as a variable satisfying $D' \in \mathcal{D}'_{d,a,2,P}$ and we estimate the number of values that $D'$ can take (this number is obviously equal to $|\mathcal{D}'_{d,a,2,P}|$). 

We define the sets $U$ and $V$: 

$$U = \{\, i \in I \mid |P \cap (a\mathbb{Z} + i)| \in 2\mathbb{Z} \,\}$$
$$V = \{\, i \in I \mid |P \cap (a\mathbb{Z} + i)| \notin 2\mathbb{Z} \,\}$$

It is easy to see that if $|V| \ge d$, then $\mathcal{D}'_{d,a,2,P} = \emptyset$ and the lemma holds. We suppose therefore that $|V| < d$ and verify the lemma in this case only. 

Let us quote the following, easy to establish, relations: 

$$U \cup V = [0 .. a)$$
$$U \cap V = \emptyset$$
$$|U| + |V| = a$$
$$U + a \subset J$$
$$V + a \subset J$$

Let $i \in V$. The cardinal of $D' \cap (a\mathbb{Z} + i)$ is then odd and, since $D' \cap (a\mathbb{Z} + i) \subset \{i, a + i\}$, we get 

$$a + i \in D' \iff i \notin D'$$
$$i \in D' - a \iff i \notin D' \tag{38}$$

From here, we can deduce that 

$$V \cap (D' - a) = V - D'$$
$$(V + a) \cap D' = (V - D') + a \tag{39}$$

Therefore, $V \cap D'$ uniquely determines $(V + a) \cap D'$. By remarking that $V \cap D'$ can take at most $2^{|V|}$ different values, we conclude that $(V \cup (V + a)) \cap D'$ can only take $2^{|V|}$ different values. 

From relation (39) we get 

$$|(V + a) \cap D'| + |V \cap D'| = |V|$$

$V$ and $V + a$ being disjoint, we conclude that $|(V \cup (V + a)) \cap D'| = |V|$. Since the sets $V \cup (V + a)$ and $U \cup (U + a)$ are disjoint, we finally get 

$$|(U \cup (U + a)) \cap D'| + |(V \cup (V + a)) \cap D'| < d$$
$$|(U \cup (U + a)) \cap D'| < d - |V| \tag{40}$$

A relation concerning $U$ and analogous to (39) can be established: 

$$(U + a) \cap D' = (U \cap D') + a \tag{41}$$

and can be used to conclude that $U \cap D'$ uniquely determines $(U + a) \cap D'$. 

Relation (41), together with the fact that $U$ and $U + a$ are disjoint, leads to the conclusion that 

$$|U \cap D'| = |(U + a) \cap D'| = \tfrac{1}{2}\,|(U \cup (U + a)) \cap D'|$$
$$|U \cap D'| < \frac{d - |V|}{2} \quad \text{(by (40))}$$

Since $U \cap D'$ is a set containing less than $(d - |V|)/2$ elements chosen among the $a - |V|$ elements of $U$, it can take at most 

$$\sum_{0 \le k < (d - |V|)/2} \binom{a - |V|}{k}$$

different values; the same is true concerning $(U \cup (U + a)) \cap D'$ (since this set is determined in a unique way by $U \cap D'$). 

From the fact that 

$$D' = ((U \cup (U + a)) \cap D') \cup ((V \cup (V + a)) \cap D')$$

we finally deduce that $D'$ can take no more than 

$$\left( \sum_{0 \le k < (d - |V|)/2} \binom{a - |V|}{k} \right) 2^{|V|}$$

different values; then 



d,a,2,P 



< a 



a - \V\ 
l(d-\V\)/2\ 



2^ 



(42) 



We can verify the following relations (remember that | V \ < d) 



d-\V, 
0 < — — '- < 



0 < n 



2 

a 1 

a - \V\ < 2 



1 a 

o " ^ IT?I 

2 a — \ V\ 



(a-\V\)-l 



which, together with (42), enable us to use Lemma 4 and obtain 



< ae " M3 rf^F (a " |y|) 2 a -l y l2l y l 



'd,a,2,P 

from that we deduce that (37) holds and we thus end the verification for b = 2. 

Now, we suppose that $b \ge 3$ and that the lemma holds for $b' = b - 1$. Supposing that $a$, $d$, $\mu$ and $P$ satisfy the lemma's hypotheses, let us establish relation (37). Let $D' \in \mathcal{D}'_{d,a,b,P}$. We can split $D'$ into the union of two disjoint subsets $D_1$ and $Q$: 

$$D_1 = D' \cap I$$
$$Q = D' \cap J$$

By definition of $\mathcal{D}'_{d,a,b,P}$, for every $i \in [0 .. a)$ we have 

$$|D' \cap (a\mathbb{Z} + i)| + |P \cap (a\mathbb{Z} + i)| \in 2\mathbb{N}$$

this can be rewritten as 

$$|D_1 \cap (a\mathbb{Z} + i)| + |Q \cap (a\mathbb{Z} + i)| + |P \cap (a\mathbb{Z} + i)| \in 2\mathbb{N}$$

and, by Lemma 6, 

$$|D_1 \cap (a\mathbb{Z} + i)| + |(P \triangle Q) \cap (a\mathbb{Z} + i)| \in 2\mathbb{N} \tag{43}$$

The facts that $D_1 \subset I$ and that $|D_1| + |Q| = |D'|$, together with relation (43), enable us to state 

$$D_1 \in \mathcal{D}'_{d-|Q|,\,a,\,b-1,\,P \triangle Q}$$

We have therefore established that every $D' \in \mathcal{D}'_{d,a,b,P}$ is the union of some $Q \subset J$ and some $D_1 \in \mathcal{D}'_{d-|Q|,\,a,\,b-1,\,P \triangle Q}$. Then, 

$$\mathcal{D}'_{d,a,b,P} \subset \bigcup_{Q \subset J} \{\, D_1 \cup Q \mid D_1 \in \mathcal{D}'_{d-|Q|,\,a,\,b-1,\,P \triangle Q} \,\}$$
$$|\mathcal{D}'_{d,a,b,P}| \le \sum_{Q \subset J} |\mathcal{D}'_{d-|Q|,\,a,\,b-1,\,P \triangle Q}| \tag{44}$$



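Let us split the sum (44) into two terms $X$ and $Y$: 

$$|\mathcal{D}'_{d,a,b,P}| \le X + Y$$

$$X = \sum_{\substack{Q \subset J \\ |Q| < (1/2 - \mu)a - 1}} |\mathcal{D}'_{d-|Q|,\,a,\,b-1,\,P \triangle Q}|$$

$$Y = \sum_{\substack{Q \subset J \\ |Q| \ge (1/2 - \mu)a - 1}} |\mathcal{D}'_{d-|Q|,\,a,\,b-1,\,P \triangle Q}|$$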

The sum $X$ is indexed by subsets of $J$ having less than $(1/2 - \mu)a - 1$ elements. Lemma 4 then implies that the number of terms in the sum is less than or equal to 

$$\sum_{0 \le i < (1/2 - \mu)a - 1} \binom{a}{i} \le a\, e^{-\mu^3 a}\, 2^a$$

From Lemma 10, we deduce that each term in $X$ is less than or equal to $2^{ab - 2a}$; therefore, 

$$X \le a\, e^{-\mu^3 a}\, 2^{ab - a} \tag{45}$$
The sum $Y$, being indexed by subsets of $J$, contains at most $2^a$ terms. Each term is of the form 

$$|\mathcal{D}'_{d-|Q|,\,a,\,b-1,\,P \triangle Q}|$$

where 

$$d - |Q| \le (1/2 - \mu)\,a(b-1) - (b-1)$$

After straightforward verifications, the induction hypothesis (Lemma 11 applied for $b - 1$) may be applied to give 

$$|\mathcal{D}'_{d-|Q|,\,a,\,b-1,\,P \triangle Q}| \le a\, e^{-\mu^3 a}\, 2^{a(b-1) + (b-1) - a}$$

Therefore, 

$$Y \le a\, e^{-\mu^3 a}\, 2^{ab + (b-1) - a} \tag{46}$$

and 

$$
\begin{aligned}
|\mathcal{D}'_{d,a,b,P}| &\le a\, e^{-\mu^3 a}\, 2^{ab - a} + a\, e^{-\mu^3 a}\, 2^{ab + (b-1) - a} \\
|\mathcal{D}'_{d,a,b,P}| &\le a\, e^{-\mu^3 a}\, 2^{ab + b - a}
\end{aligned}
$$

□ 
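
The last step of the induction, which combines the bounds (45) and (46) on $X$ and $Y$, can be spelled out as follows. This is a sketch of the arithmetic, assuming the two bounds read $X \le a e^{-\mu^3 a} 2^{ab-a}$ and $Y \le a e^{-\mu^3 a} 2^{ab+(b-1)-a}$:

```latex
\begin{align*}
X + Y &\le a\,e^{-\mu^3 a}\,2^{ab-a} + a\,e^{-\mu^3 a}\,2^{ab+(b-1)-a} \\
      &=   a\,e^{-\mu^3 a}\,2^{ab-a}\,\bigl(1 + 2^{\,b-1}\bigr) \\
      &\le a\,e^{-\mu^3 a}\,2^{ab-a}\cdot 2^{\,b}
           && \text{since } 1 + 2^{\,b-1} \le 2^{\,b} \text{ for } b \ge 1 \\
      &=   a\,e^{-\mu^3 a}\,2^{ab+b-a}
\end{align*}
```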

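The definition of $\mathcal{D}_{\delta,p}$, together with Lemma 8, implies that $\mathcal{D}_{\delta,p} \subset \mathcal{D}'_{\delta,\,l \sqcap p,\,\frac{l}{l \sqcap p},\,\emptyset}$; if we set $\mu = \mu'(l)$, Lemma 11 implies 

$$|\mathcal{D}_{\delta,p}| \le l\, e^{-\mu'(l)^3 (l \sqcap p)}\, 2^{\,l + \frac{l}{l \sqcap p} - l \sqcap p} \tag{47}$$

which is our second upper bound on $|\mathcal{D}_{\delta,p}|$. 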

6.9 Conclusion 

Let us use the two bounds (35) and (47) to estimate the sum described in (33). For 
I n p < jfj , we have (by (35)) 



\Vs, P \2 lnp < I e-" ^ 1 2 l+lnp 

\Vg,p\2 lnp < Ze-"' ( ' )3 '2' + nfr (48) 
For I n p > j^j, we use (47), which implies, 

\Vs !P \2 lnp < l e -"' <l ^ ln ^2 l+ ^ 

\Vg,p\2 lnp < Z 2 e -^'W 3 hfr 2 Z (49) 
For every term in the sum (33), either (48) or (49) holds. Therefore, 

1//2J 

|^| < ^max(Ze-^^2' + nfr , fe^'^^2 1 ) 
P =\ 

\E\ < max(z 2 e-^'^2' + nfr , Z 3 e^'^nfr 2 ') (50) 
From (50), using the definition of $\mu'$, we get (after a tedious computation) relation (21). □ 

6.10 The proof of Capital Lemma 2 

Let us describe the modifications that the proof of Capital Lemma 1 (Sections 6.6-6.9) should undergo in order to become a proof of Capital Lemma 2. Note that the function $\varepsilon$ used in both proofs is the same. 

By analogy with the objects $\delta$ and $E$ (see (19) and (20)), we define 

$$\delta = \mathrm{up}(l) + l\,\varepsilon(l)$$
$$E = \{\, S \in \{0,1\}^l \mid d'(S) > \delta \,\} \tag{51}$$

The property to be proven can then be expressed by the relation (analogous to (21)) 

$$|E| \le \frac{1-\alpha}{2}\,2^l \tag{52}$$
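
By analogy with (30), we get 

$$|E| \le \sum_{p=1}^{\lfloor l/2 \rfloor} \sum_{|D| > \delta} |E_{p,D}| \tag{53}$$

By analogy with $\mathcal{D}_{d,p}$ (see (31)), we define for any $d \in \mathbb{N}$, 

$$\mathcal{D}_{d,p} = \{\, D \mid |D| > l - d \wedge E_{p,D} \ne \emptyset \,\} \tag{54}$$

Then, in the same way as relation (34) is obtained, we get 

$$|\mathcal{D}_{\delta,p}| \le \sum_{l - \delta < x \le l} \binom{l}{x} \le \sum_{0 \le x < \delta} \binom{l}{x}$$

which, in turn, leads us to the first upper bound on $|\mathcal{D}_{\delta,p}|$ (analogous to (35)): 

$$|\mathcal{D}_{\delta,p}| \le l\, e^{-\mu'(l)^3 l}\, 2^l \tag{55}$$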



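In order to obtain the second upper bound on $|\mathcal{D}_{\delta,p}|$ (analogous to (47)), we use Lemma 8 and get, for all $i \in [0 .. l \sqcap p)$, 

$$
\begin{aligned}
D \in \mathcal{D}_{\delta,p} &\implies |((l \sqcap p)\mathbb{Z} + i) \cap D| \in 2\mathbb{Z} \\
&\implies |((l \sqcap p)\mathbb{Z} + i) \cap ([0 .. l) - D)| + |((l \sqcap p)\mathbb{Z} + i) \cap [0 .. l)| \in 2\mathbb{Z}
\end{aligned}
\tag{56}
$$

The definition of $\mathcal{D}_{\delta,p}$ (formula (54)) implies that 

$$D \in \mathcal{D}_{\delta,p} \implies |[0 .. l) - D| < \delta \tag{57}$$

From (56) and (57), and from Auxiliary Definition 10, we get 

$$\mathcal{D}_{\delta,p} \subset \{\, D \subset [0 .. l) \mid [0 .. l) - D \in \mathcal{D}'_{\delta,\,l \sqcap p,\,\frac{l}{l \sqcap p},\,[0 .. l)} \,\}$$

Finally, by observing that the function transforming $D$ (for $D \subset [0 .. l)$) into $[0 .. l) - D$ is bijective, we obtain 

$$|\mathcal{D}_{\delta,p}| \le |\mathcal{D}'_{\delta,\,l \sqcap p,\,\frac{l}{l \sqcap p},\,[0 .. l)}|$$

and using Lemma 11, we get the second upper bound on $|\mathcal{D}_{\delta,p}|$ (analogous to (47)): 

$$|\mathcal{D}_{\delta,p}| \le l\, e^{-\mu'(l)^3 (l \sqcap p)}\, 2^{\,l + \frac{l}{l \sqcap p} - l \sqcap p} \tag{58}$$

The two bounds (55) and (58) enable us to derive (52) in the same way as (21) is obtained in Section 6.9. □ 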
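
The passage to complements used in this section can be checked on small cases. The sketch below verifies that a set $D \subset [0 .. l)$ meets every residue class modulo $g$ in an even number of points exactly when its complement satisfies the parity condition of Auxiliary Definition 10 with $P = [0 .. l)$. Here $g$ stands for $l \sqcap p$, assumed to be a divisor of $l$, and the function names are illustrative.

```python
def satisfies_parity(Dp, a, P):
    """Parity condition of Auxiliary Definition 10: for each residue i modulo a,
    |Dp ∩ (aZ+i)| + |P ∩ (aZ+i)| is even."""
    return all((sum(1 for x in Dp if x % a == i)
                + sum(1 for x in P if x % a == i)) % 2 == 0
               for i in range(a))

def complement(D, l):
    """The set [0..l) - D."""
    return frozenset(range(l)) - frozenset(D)

def subsets(l):
    """All subsets of [0..l), generated from bit masks."""
    for mask in range(2 ** l):
        yield frozenset(i for i in range(l) if (mask >> i) & 1)

def complement_rule_holds(l, g):
    """D is even on each class (P empty) iff its complement is admissible for P = [0..l)."""
    return all(satisfies_parity(D, g, ()) ==
               satisfies_parity(complement(D, l), g, range(l))
               for D in subsets(l))
```

The equivalence holds because each class contains $l/g$ elements of $[0 .. l)$, so the two cardinals in the condition for the complement add up to $2l/g$ minus the number of points of $D$ in the class.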